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Abstract 

We propose a method for the approximation of solutions of PDEs with stochas- 
tic coefficients based on the direct, i.e., non-adapted, sampling of solutions. This 
sampling can be done by using any legacy code for the deterministic problem as a 
black box. The method converges in probability (with probabilistic error bounds) 
as a consequence of sparsity and a concentration of measure phenomenon on the 
empirical correlation between samples. We show that the method is well suited for 
truly high-dimensional problems (with slow decay in the spectrum). 
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1 Introduction 



Realistic analysis and design of complex engineering systems require not only 
a fine understanding and modeling of the underlying physics and their inter- 
actions, but also a significant recognition of intrinsic uncertainties and their 
influences on the quantities of interest. Uncertainty Quantification (UQ) is 
an emerging discipline that aims at addressing the latter issue; it aims at 
meaningful characterization of uncertainties in the physical models from the 
available measurements and efficient propagation of these uncertainties for a 
quantitative validation of model predictions. 
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Despite recent growing interests in UQ of complex systems, it remains a 
grand challenge to efficiently propagate uncertainties through systems char- 
acterized by a large number of uncertain sources where the so-called curse- 
of-dimensionality is yet an unsolved problem. Additionally, development of 
n on- intrusive uncertainty propagation techniques is of essence as the analysis 
of complex multi-disciplinary systems often requires the use of sophisticated 
coupled deterministic solvers in which one cannot readily intrude to set up 
the necessary propagation infrastructure. 

Sampling methods such as the Monte Carlo simulation and its several vari- 
ants had been utilized for a long time as the primary scheme for uncertainty 
propagation. However, it is well understood that these methods are gener- 
ally inefficient for large-scale systems due to their slow rate of convergence. 
There has been an increasing recent interest in developing alternative numer- 
ical methods that are more efficient than the Monte Carlo techniques. Most 
notably, the stochastic Galerkin schemes using polynomial chaos (PC) bases 
[36, 27, 61, 2, 58] have been successfully applied to a variety of engineering 
problems and are extremely useful when the number of uncertain parameters 
is not large. In their original form, the stochastic Galerkin schemes are intru- 
sive, as one has to modify the deterministic solvers for their implementation. 
Stochastic collocation schemes [53, 42, 60, 1, 47] belong to a different class of 
methods that rely upon (isotropic) sparse grid integration/interpolation in the 
stochastic space of the problem to reduce the curse-of-dimensionality associ- 
ated with the conventional tensor-product integration/interpolation rules. As 
their construction is primarily based on the input parameter space, the compu- 
tational cost of both stochastic Galerkin and collocation techniques increases 
rapidly for large number of independent input uncertainties. 

More recently, efforts have been made to construct solution-adaptive uncer- 
tainty propagation techniques that exploit any structures in the solution to 
decrease the computational cost. Among them are the multi-scale model reduc- 
tion of [32] and the sparse decomposition of [55, 8, 6, 7, 10] for the stochas- 
tic Galerkin technique, anisotropic and adaptive sparse grids of [46, 41] for 
the stochastic collocation scheme, and low-rank solution approximations of 
[48, 49, 33]. 

In the present study, we are interested in cases where the quantity of interest 
is sparse at the stochastic level, i.e., it can be accurately represented with only 
few terms when linearly expanded into a stochastic, e.g., polynomial chaos, 
basis. Interestingly, sparsity is salient in the analysis of high- dimensional prob- 
lems where the number of energetic basis functions (those with large coeffi- 
cients) is small relative to the cardinality of the full basis. For instance, it has 
been shown in [55, 8] that, under some mild conditions, solutions to linear el- 
liptic stochastic PDEs with high- dimensional random coefficients admit sparse 
representations with respect to the PC basis. Consequently, an approach based 
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on a zero- dimensional algebraic stochastic problem has been proposed in [8] to 
detect the sparsity pattern, which then guides the stochastic Galerkin analysis 
of the original problem. Moreover, a "quasi" -best iV-term approximation for 
a class of elliptic stochastic PDEs has been proposed in [7]. 

In this work, using concentration of measure inequalities and compressive sam- 
pling techniques, we derive a method for PC expansion of sparse solutions to 
stochastic PDEs. The proposed method is 

• Non-intrusive: it is based on the direct random sampling of the PDE solu- 
tions. This sampling can be done by using any legacy code for the deter- 
ministic problem black box. 

• Non-adapted: it does not tailor the sampling process to identify the impor- 
tant dimensions at the stochastic level 

• Provably convergent: we obtain probabilistic bounds on the approximation 
error proving the stability and convergence of the method. 

• Well-suited to problems with high-dimensional random inputs. 

Compressive sampling is an emerging direction in signal processing that aims 
at recovering sparse signals accurately (or even exactly) from a small number 
of their random projections [19, 20, 16, 29, 14, 17, 15, 21, 13]. A sparse sig- 
nal is simply a signal that has only few significant coefficients when linearly 
expanded into a basis, e.g., {V'a}- 

For sufficiently sparse signals, the number of samples needed for a success- 
ful recovery is typically less than what is required by the Shannon-Nyquist 
sampling principle. Generally speaking, a successful signal reconstruction by 
compressive sampling is conditioned upon: 

• Sufficient sparsity of the signal; and 

• Incoherent random projections of the signal. 

A square-measurable stochastic function u(u), defined on a suitable probabil- 
ity space (Q, J 7 , V) can be expanded into a mean-squared convergent series of 
the chaos polynomial bases, i.e., u(u) ~ J2 a c ail>a{u), with some cardinality 
P. The stochastic function u(u) is then sparse in PC basis {ip a }, if only a 
small fraction of coefficients c a are significant. In this case, under certain con- 
ditions, the sparse PC coefficients c may be computed accurately and robustly 
using only N ^ P random samples of u(u) via compressive sampling. Given 
N random samples of u(uj), compressive sampling aims at finding the sparsest 
(or nearly sparsest) coefficients c from an optimization problem of the form 

(P s ,s) '■ min||Wc|| s subject to \\^c — w|| 2 < 5, (1) 

where ||Wc|| S) with s = {0, 1} and some positive diagonal weight matrix W, 
is a measure of the sparsity of c and H^c — u\\ 2 is a measure of the accuracy of 
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the truncated PC expansion in estimating the u(u>) samples. The iV- vector u 
contains the independent random samples of u{oj) and the rows of the N x P 
matrix \I/ consist of the corresponding samples of the PC basis {ip a }- 

Throughout the rest of this manuscript, we will elaborate on the formulation 
of the compressive sampling problem (1) and the required conditions under 
which it leads to an accurate and stable approximation of an arbitrary sparse 
stochastic function as well as sparse solutions to linear elliptic stochastic PDEs. 
Although we choose to study this particular class of stochastic PDEs, we stress 
that the proposed algorithms and theoretical developments are far more gen- 
eral and may be readily applied to recover sparse solution of other stochastic 
systems. 

In Section 2, we describe the setup of the problem of interest. We then, in 
Section 3.1, briefly overview the spectral stochastic discretization of the ran- 
dom functions using PC basis. The main contribution of this work on sparse 
approximation of stochastic PDEs using compressive sampling is then intro- 
duced in Sections 3.2, 3.4, and 3.5. Sections 3.6 and 3.7 discuss some of the 
implementation details of the present technique. To demonstrate the accuracy 
and efficiency of the proposed procedures, in Section 4, we perform two nu- 
merical experiments on a 1-D (in space) linear elliptic stochastic PDE with 
high-dimensional random diffusion coefficients. 



2 Problem setup 

Let (O, J 7 , V) be a complete probability space where V is a probability measure 
on the a— field J 7 . We consider the following elliptic stochastic PDE defined 
on a bounded Lipschitz continuous domain V> C M D , D — 1, 2, or 3, with 
boundary dT>, 

-V • (a(x, w)Vu(x, u)) = f(x) xeV, (2) 
u(x,u) = x G &D, 

V — a.s. uj G fl The diffusion coefficient a(x, u) is a stochastic function defined 
on (Q, J 7 , V) and is the source of uncertainty in (2). We assume that a(x,cu) 
is specified by a truncated Karhunen-Loeve- "like" expansion 

d 

a(x 7 u) = a(x) + ^y/Xi4>i(x)yi(cj), (3) 

i=l 

where (Aj,0j), % = 1, • • • ,d, are the eigenpairs of the covariance function 
C aa (xi,x 2 ) G L 2 (T> x V) of a(x,u) and a(x) is the mean of a(x,u). We 
further assume that a(x,u) satisfies the following conditions: 
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A-I. For all x G V, there exists constants a m i n and a max such that 

< a m i n < a(x, uj) < a max < oo V — a.s. u G f2 (4) 

A-II. The covariance function C aa (a;i, X2) is piecewise analytic on V x V 
[52, 8], implying that there exist real constants C\ and Ci such that for 
z = 1, ■ ■ • ,d, 

< A; < cie -Pai " (5) 

and 

Vck G N d : V^H^^IU 00 ^) ^ c i e_C2iK > ( 6 ) 
where k := 1/D and a G N d is a fixed multi-index. Notice that the decay 
rates in Eqs. (5) and (6) will be algebraic if C aa (xi,X2) has C S (T> x T)) 
regularity for some s > [52]. 

A-III. The random variables {yk(oo) }f =1 are independent and uniformly dis- 
tributed on Tfc := [—1, 1], A; = 1, • ■ ■ , d, with probability distribution func- 
tion pk{yk) = 1/2 defined over T^. The joint probability distribution function 
of the random vector y := (y u ■ ■ ■ , y d ) is then given by p(y) := FlLi Pfc(z/fc)- 

Remark: The algorithm proposed in the paper requires the existence of a 
sparse solution. The only role of assumption A-II is to guarantee the exis- 
tence of such a sparse solution. It is not necessary for the application and the 
validity of the proposed algorithm. In particular, if the coefficient a is only es- 
sentially bounded, the proposed algorithm will be accurate as long as a sparse 
approximation exists. 

Given the finite-dimensional uncertainty representation in (3), the solution 
u(x,lj) of (2) also admits a finite-dimensional representation, i.e., 

u(x, y) := u(x, y x (u), ■ ■ ■ , y d (u)) : V x T R, (7) 

where r:=nLir fe . 

In what follows, we first briefly outline the Legendre spectral stochastic dis- 
cretization of u(x, y) and consequently introduce our approach based on com- 
pressive sampling to obtain such a discretization. 



3 Numerical approach 

3.1 Spectral stochastic discretization 
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In the context of the spectral stochastic methods [36, 27, 61, 2], the solution 
u(x, y) of (2) is represented by an infinite series of the form 



u(x,y)= Y c a {x)ip a {y), (8) 



where Nq := {(chi, ■ ■ ■ , ay) : oij G N U {0}} is the set of multi-indices of size 
d defined on non- negative integers. The basis functions {xf) a (y)} are multi- 
dimensional Legendre polynomials, referred to as the Legendre polynomial 
chaos (PC), and are orthogonal with respect to the joint probability measure 
p(y) of the random vector y. Each basis function ip a (y) is a tensor product 
of univariate Legendre polynomials ip ai {yi) of degree ccj G Nq, i.e., 

My) = ^1(2/1)^00(2/2) ■■■^a d {Vd) 01 G Nq. (9) 

We here assume that the univariate Legendre polynomials ip ai {yi) are also 
normalized such that 

^ 2 ai {yi)Pi{yi)dyi = i, i = i, ••-,<*. (10) 

The exact generalized Fourier coefficients c a (x) in (8), referred to as the PC 
coefficients, are computed by the projection of u(x, y) onto each basis function 
^a(y), 

c a (x) = E [u(x, y)ip a (y)} = J u(x, y)ip a (y)p(y)dy. (11) 

Here, E denotes the expectation operator. In practice, the expansion (8) is 
finite; that is, only a finite number of spectral modes is needed to approximate 
u(x, y) within a given target accuracy. Traditionally, a finite-order truncation 
of the basis {ip a (y)} is adopted for the approximation, i.e., 

u p (x,y):= Y c a {x)tl) a {y), (12) 

where the set of multi-indices A p>d is 

A p>d := {a G N d : ||a||i < p, ||a|| < d} (13) 
and has the cardinality 

Here, ||a||i = Z)i=i a t an d ll^llo = #{2 : «i > 0} are the total order (degree) 
and dimensionality of the basis function ip a (y), respectively. The approxima- 
tion is then refined by increasing p to achieve a given target accuracy. Under 
assumptions A-I, A-II, and A-III stated in Section 2, the solution u(x, y) is an- 
alytic with respect to the random variables {yi}f =1 (see [1]), and as p increases, 
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the approximation (12) converges exponentially fast in the mean-squares sense 
[2, 1, 8]. 

Definition (Sparsity) The solution u{x,y) is said to be (nearly) sparse if 
only a small fraction of coefficients c a (x) in (12) are dominant and contribute 
to the solution statistics. 

As will be described in Section 3.2, a sparse solution u(x, y) may be accurately 
recovered using N <^ P random samples {u(x, using compressive sam- 

pling techniques. This has to be compared, for instance, with the least-squares 
regression- type techniques, [38], that normally require N ^> P samples for an 
accurate recovery. 



3.2 Sparse recovery using compressive sampling 

Compressive sampling is an emerging theory in the field of signal and image 
processing [19, 20, 16, 29, 14, 17, 15, 21, 13]. It hinges around the idea that 
a set of incomplete random observations of a sparse signal can be used to 
accurately, or even exactly, recover the signal (provided that the basis in which 
the signal is sparse is known). In particular, the number of such observations 
may be much smaller than the cardinality of the signal. In the context of 
problem (2), compressive sampling may be interpreted as follows. The solution 
u(x, y) that is sparse, in the sense of Lemma 3.2 defined in Section 3.3, can 
be accurately recovered using N <^ P random samples {u(x,yi)}f =1 , where 
P is the cardinality of the Legendre PC basis {ip a (y)}. We next elaborate on 
the above statement and address how such a sparse reconstruction is achieved 
and under what conditions it is successful. 

Let {u{yi)}f =1 be i.i.d. random samples of u(x, y) for a fixed point x in V. 
For the time being, let us assume that the pth-order PC basis {tf) a (y)} is a 
complete basis to expand u(y); we will relax this assumption as we proceed. 
Given pairs of {yi}fLi and {u(yi)}f =x , we write 

u (Ui) = Yl c ^ a {yi), i = l,---,JV, (15) 

or equivalently, 

Vc = u. (16) 

Here c G M p is the vector of PC coefficients c a to be determined, u G M. N is 
the vector of samples of u(y), and each column of the measurement matrix 
G M 7VxP contains samples of the jth element of the PC basis, i.e., 

*M=^.(2/;), ; = !,-••, iV, j = l,---,P. (17) 
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We are interested in the case that the number N of solution samples is much 
smaller than the unknown PC coefficients P, i.e., N ^ P. Without any ad- 
ditional constraints on c, the underdetermined linear system (16) is ill-posed 
and, in general, has infinitely many solutions. When c is sparse; that is, only 
a small fraction of the coefficients c a are significant, the problem (16) may 
be regularized to ensure a well-posed solution. Such a regularization may be 
achieved by seeking a solution c with the minimum number of non- zeros. This 
can be formulated in the optimization problem 

(Pq) : m i n ll c l|o subject to *&c=u, (18) 

where the semi-norm ||c|| := #{ck : c a ^ 0} is the number of non-zero 
components of c. In general, the global minimum solution of (Pq) is not unique 
and is NP-hard to compute: the cost of a global search is exponential in P. 
Further developments in compressive sampling resulted in a convex relaxation 
of problem (Pq) by minimization of the £i-norm of the solution c instead, i.e., 

(Pi): min||Wc||i subject to = u, (19) 

where W is a diagonal matrix whose [j,j] entry is the ^-norm of the jth 
column of ^ and || • ||i denotes the £i-norm. Notice that the £i-norm is the 
closest convex function to the £ -noTm that compels the small coefficients 
c a to be zero, thus promoting the sparsity in the solution. The purpose of 
weighting the l\ cost function with XV is to prevent the optimization from 
biasing toward the non- zero entries in c whose corresponding columns in ^ 
have large norms. The problem (Pi) is entitled Basis Pursuit (BP) [19] and its 
solution can be obtained by linear programming. Since the £i-norm functional 
||c||i is convex, the optimization problem (Pi) admits a unique solution that 
coincides with the unique solution to problem (P ) for sufficiently sparse c 
with some constraints on the measurement matrix e.g., see [13]. 

In general, the pth-order PC basis is not complete for the exact representation 
of u(y); therefore, we have to account for the truncation error. This can be 
accommodated in (P ) and (Pi) by allowing a non- zero residual in the con- 
straint = u. Therefore, as in Sections 3.2.1 and 3.2.3 of [13], the proposed 
algorithms in this paper are error-tolerant versions of (Pq) and (Pi), with error 
tolerance 5, i.e., 

(Pq,s) : min||c|| subject to ||*c-u|| 2 <5 (20) 

and 

(Pi, 5) : min||Wc||i subject to ||*c — u\\ 2 < 5, (21) 

respectively. The latter problem is named Basis Pursuit Denoising (BPDN) 
in [19] and may be solved using techniques from quadratic programming. We 
leave the discussion on the available algorithms for solving problems (P^s) and 
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(Po,s) to Section 3.7. Instead, we henceforth delineate on sufficient conditions 
under which the BPDN problem (P\ t s) leads to a successful Legendre PC 
expansion of a general essentially bounded sparse stochastic function u(y) 
and, subsequently, the sparse solution u(x, y) to the problem (2). Our results 
are extensions of those in [31, 13], adapted to the case where the measurement 
matrix ^ consists of random evaluations of the Legendre PC basis {ip a }- With 
slight differences that will be remarked accordingly, similar results hold for the 
case of the (Pq,s) problem. 

Theorem 3.1 (General stability of {Pi t s)) Letu(y) be an essentially bounded 
function of i.i.d. random variables y := (yi, ■ ■ ■ , yd) uniformly distributed on 
T := [— 1, Define 

N 

5max := 64P 4 ^(lnP)' (22) 

with 

In 3 p 

c p ,d := 9 i n f(p+<*)'V (23) 



V p\d\ 

Let Up' 5 (y) := J2 a &A p d c-of^aiy) be the Legendre PC approximation of u(y) 
with coefficients c 1 ' 6 computed from the t\-minimization problem (Pi,s) with 
S > 0. If there exists a Legendre PC expansion u p (y) := Sc*eA 6 d c a^a(y), for 

,o 



p.d 

some index set A p d C A p ^ such that 



u — u p 



< e and 

L°°(r) — 



S < 5 max , (24) 
with S := |Ap d |, then with probability 



Prob\ > 1 4.P 2- 2S' m ax pSSmux p—8S m nxP 4Cp,d (25) 



(on the N samples {u{yi)}f =l ) and for some constants c\ and C2, the solution 
Up S must obey 

u - u 1 ' 6 n < cie + c 2 ^. (26) 
p i2 ( r ) ~ WN 



In simple words, Theorem 3.1 states that if an essentially bounded stochastic 
function u(y) admits a sparse Legendre PC expansion, then the £i-minimization 
problem (Pi,s) can accurately recover it from a sufficiently large number of 
random solution samples. The recovery is stable under the truncation error 
\\^c — u|| 2 and is within a distance of the exact solution that is proportional 
to the error tolerance 5. It is worth highlighting that no prior knowledge of the 
sparsity pattern of the PC coefficients c is needed for an accurate recovery. 

Remark: Based on the conditions (22) and (24), the number N of random 
samples has to grow like P 4c p. d In P and also proportional to the number of 
dominant coefficients S = \A e pd \. Given any order p of the PC expansion, for 
sufficiently high-dimensional problems, the constant c p ^ < 1/4 (see Lemma 
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3.5 and Fig. 1), thus justifying N <C P. In fact, the conditions (22) and (24) 
are too pessimistic; in practice, the number of random samples required for 
an accurate recovery is much smaller than the theoretical value in (22). We 
will elaborate on this statement in Section 3.5. 

Remark: Although the BPDN reconstruction is achieved by minimizing the 
^i-norm of the solution, based on (26), the approximation also converges to 
the exact solution in the mean-squares sense. 

Remark: A similar theorem holds for the case of the sparse approximation 
using the £ - mrn i m i za tion problem (Po,s) in (20). In this case, the condition 
(22) has to be replaced with S max := 16p ic^ d ^ nP ^ which is, in theory, milder 
than that of the (P\ t s) problem. The error estimate (26) also holds with a 
larger probability, but with different constants c\ and c 2 . 

3.3 Sparsity of the solution u(x, y) 

Notice that the accurate recovery of u(y) is conditioned upon the existence 
of a sparse PC expansion u p (see Theorem 3.1). In fact, this assumption may 
not hold for an arbitrary stochastic function u(y), as all the elements of the 
basis set {ip a (y)} may be important. In this case, our sparse approximation 
still converges to the actual solution but, perhaps, not using as few as C P 
random solution samples. 

We will now summarize the results of [55, 8] on the sparsity of the Legendre 
PC expansion of the solution u(x,y) to the problem (2). Alternative to the 
pth-order truncated PC expansion of (12), one may ideally seek a proper index 
set A pd C A p d , with sufficiently large p, such that for a given accuracy e 

A p,d : = argmin{|A Pid | : ~A p4 C A p4 , \\u - u p \\ H i {VjLx{r)) < e} , (27) 

in which u p (x, y) := Y^ a eA p d c cx{x)^ a {y). This will lead to the so-called sparse 
approximation oiu(x,y) if 

« \KA = P, (28) 

where A p ^ is defined in (13). Such a reduction in the number of basis functions 
in (28) is possible as, given the accuracy e, the effective dimensionality v of 
u(x, y) in T is potentially smaller than the apparent dimensionality d. More 
precisely, under assumptions A-I, A-II, and A-III stated in Section 2, the 
analyses of [55, 8] imply that the discretization of u(x, y) using a sparse index 
set Ap ;t ,, 

Ap,* := ||a||i < p, ||a|| < v < d] (29) 
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preserves the exponential decay of the approximation error in the (T>, L°°(T)) 
sense. For the sake of completeness, we cite this from [8] in the following 
lemma. 

Lemma 3.2 (Proposition 3.10 of [8]) Given assumptions A-I, A-II, and 
A-III in Section 2, there exist constants Ci, 02,03,04 > 0, depending only on 
a(x,u) and f{x) but independent of d, p, v, such that 

||„, oi II , <-" n („-C2V 1+K _i_ C3u(\nd+\np)-Cip\ ( c i(Y\ 

\\ u - u pA\Hl{V,L™(T)) S Ci [e +e " j ? ^yj 

for any d,p, v e N with v < d and k= 1/D. 
In particular, for d > Cd\ lnej 1 ^, choosing 

Pe = \ Cp d K ] < p and v € = M K/(K+1) 1 < d, (31) 

leads to 

||« — Wp £ ,i/ e ||i?i(i>,i°°(r)) < e ; (32) 
where u v ^ Vt is now defined on a sparse index set 

Ape,^ := (aeNj: ||a||i < p e , ||a|| < v e ) (33) 

with cardinality 

\K,u c \<^ 1/p , (34) 

for some arbitrary large p > and constants q, c p , and independent of 
d,p e , and z/ e [8]. 

In practice, the sparse set A e pd in (27) (or equivalently A PctUc in (33)) is not 
known a priori. In [8], an approach based on an algebraic purely-stochastic 
problem is proposed to adaptively identify d . Having done this, the coeffi- 
cients of the the spectral modes are computed via the (intrusive) stochastic 
Galerkin scheme [36, 61]. Alternatively, in this work, we apply our sparse ap- 
proximation using (Pi t s) and {Po,s) to compute u(x,y). The implementation 
of (Pi t s) and (-£0,5) is non-intrusive; only random samples of the solution are 
needed. Moreover, we do not adapt the sampling process to identify the im- 
portant dimensions at the stochastic level; therefore, our constructions are 
non-adapted. 

Throughout the rest of the present paper, we focus our attention on the case 
of the stochastic PDE (2) whose solution is provably sparse. The statement 
of Theorem 3.1 can be specialized to the approximation of the sparse solution 
to the stochastic PDE (2) using (Pi t s) (or (-Po,<0) as follows. 
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3.4 Stability of (P 1)S ) for stochastic PDE (2) 



Combining Lemma 3.2 with Theorem 3.1 leads to the following theorem. 

Theorem 3.3 (Stability of (P M ) for stochastic PDE (2)) Letu l /(x,y) : 
J2a£A p d c h S (^^a-iy) be the pth-order Legendre PC approximation ofu(x,y) 
in (2) where the coefficients c)^(x) are obtained from the t\-minimization 
problem (Pi t s) with N independent samples of u(x,y) and arbitrary 5 > 0. 
Write k := 1/D. Let c p ^ be defined by (23) and let S max be defined by (22). 
Let p > be arbitrary. 

There exists constants c±, c 2 , c 3 , c 4 , c 5 independent from p, d, k, N such that if 
\c 2 d K ] < p and \c 3 d K ^ K+1 ^] < d, 

then with probability at least 

Prob\ > 1 AP^~ 2 ^ max p—RSmax p—8P 4c P' d Smax (35) 



the solution u 1 ^ 5 must obey 



u - u)f 



< Ca6 + Cr,— (36) 



with 




e := max — — , exp { - { — ] ) ) (37) 

\ Jmax 



3.5 Proofs and further ingredients of successful sparse approximations via 
(P M ) and (P ,a) 



The ability of problems (P\ t s) and (Pq,s) in accurately approximating the 
sparse PC coefficients c in (12), hence the solution u(x,y), depends on two 
main factors: i) the sparsity of the PC coefficients c and ii) the mutual coher- 
ence of the measurement matrix ^. In fact, the number N of random solution 
samples required for a successful sparse approximation is dictated by these two 
factors. While sparsity is a characteristic of the solution of interest u(x,y), 
the mutual coherence of the measurement matrix ^ is universal as it only 
depends on the choice of PC basis {ip a (y)} and the sampling process from 
which ^ is assembled. 



In Section 3.3, based on the analysis of [8], we rationalized the sparsity of 
u(x, y) with respect to the Legendre PC basis. We now give the definition of 
the mutual coherence of ^ and discuss its role in our sparse approximation 
using (P lj<5 ) and (P ,s)- 
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3. 5. 1 Mutual coherence of ^ 



Definition (Mutual Coherence [31]) The mutual coherence fi(^f) of a 
matrix £ M ArxP is the maximum of absolute normalized inner-products of 
its columns. Let i/jj and ift^ be two distinct columns of . Then, 

\4>TiJ>k\ 

:= max „ Y , , ,, • 38 

In plain words, the mutual coherence is a measure of how close to orthogonal 
a matrix is. Clearly, for any general matrix 

< //(*) < 1, (39) 

where the lower bound is achieved, for instance, by unitary matrices. However, 
for the case of N < P, the mutual coherence //(v£) is strictly positive. It is well 
understood that measurement matrices with smaller mutual coherence have 
a better ability to recover a sparse solution using compressive sampling tech- 
niques, e.g., see Lemma 3.6. Therefore, we shall proceed to examine the mutual 
coherence of the random measurement matrix in (16). We first observe that, 
by the orthogonality of the Legendre PC basis, the mutual coherence 
converges to zero almost surely for asymptotically large random sample sizes 
N. However, it is essential for our purpose to i) investigate if a desirably small 
/i(^) can be achieved by a sample size N ^ P and ii) quantify how large 
/i(^) can get for a finite N. These are addressed in the following theorem. 

Theorem 3.4 (Bound on the mutual coherence /i(^)) Let ^ £ M ArxP , 
as defined in (16), be the measurement matrix corresponding to N indepen- 
dent random samples of the Legendre polynomial chaos basis of order p in 
d i.i.d. uniform random variables y. There exists a positive constant c Pi( j : = 
IT^ ! ((p+dy.\ depending on p and d, such that if 



< r = 2 v /C^ 4Cp - d O P)/N < 1 /2, (40) 

for some ( > 1, then 

Prob L(*) > — ^— I < 4P 2 ~ 2C . (41) 
1 — r. 



Figure 1 illustrates the decay of c p ^, for several values of p, as a function 
of d. Based on Theorem 3.4, for cases where the number d of random vari- 
ables y is large enough such that c Pt d < 1/4, it is sufficient to have N ~ 
0(16P ACp d In P) < P to keep /i(\f r ) bounded from above with a large prob- 
ability. Notice that such a requirement on c Pi d is particularly suited to high- 
dimensional problems. 
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' 10 20 30 40 50 60 



d 

Fig. 1. Decay of c Pt d as a function of d. p = 1 (□); p = 2 (o); p = 3 (V); p = 4(o). 

Remark: We observe that, given the choice of r in (40), the upper bound on 
(Ji{&) in (41) decays like 1/y/N for asymptotically large N, which is consistent 
with the Central Limit Theorem. 



In order to prove Theorem 3.4, we first need to compute the maximum of the 
Legendre PC basis functions ip a (y). This is given in the following lemma. 

Lemma 3.5 (Maximum of ip a (y)) Let {ijj a (y)} be the Legendre polynomial 
chaos basis of total order p in d i.i.d uniform random variables y (as defined 
in (13)) and with cardinality P. Then, 

||^«IU~(r) <P Cp - d , (42) 

with a constant 

_ In 3 p 

Proof. Given the equality 
we have 

d 

H^alU-(r) = nv / 2tt7TI. 

Using the constraint ct-i E Nq and Y^=i a i < Pi the right-hand-side is max- 
imized when p of a^s are equal to one and d — p equal to zero (we assume 
d> p). We deduce that 



)a\\L™(T) 



V „ln3 p „„ 

< 32 = p—hTP = P C P,< 



We are now ready to prove Theorem 3.4. 
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Proof of Theorem 3.4- The mutual coherence /x(^ f ) is 



max 

l<j,k<P, 



1/2- 



(44) 



Given the independence of samples {yi\f =1 and using the McDiarmid's in- 
equality, we obtain 



Prob 



N 



N i= i 



> r 



< 2 exp 



-2Nr 2 



<y-k II £°°(r\) y 



(45) 

Using Lemma 3.5, we may probabilistically bound the numerator in (44) as 



Prob 



N 



JV i=l 



> r 



< 2 exp 



2P 4c w 



in which we exploit the orthonormality of ip aj (y) and ^> afc (y), i.e., Ei[i/} aj ip ah ] 
5jk- Similarly, for j = 1, • • • , P, we have 



Prob 
Therefore, 

Prob 



N 



7V i=l 



< 



exp 



—Nr 2 



< 



-Nr 2 



exp 



^Ef =1 ^(^)) 1/2 (^Ef =1 <(y 



1/2 



> 



< 4 exp 



-Nr 2 



2P 4c P,d J 
(46) 



and 



Taking 



Prob 



M*)> 



< 4P 2 exp 



-iVr 2 

2P 4c P.d 



r = 2^CP 4Cp ' d OP)/A r , 
for some £ > 1, we arrive at the statement of the Theorem 3.4. □ 



(47) 
(48) 



To summarize, we observe that with large probability, the mutual coherence 
/i(vf') of the measurement matrix \I/ in (16) can be arbitrarily bounded from 
above by increasing the number N of independent random solution samples. 
Moreover, given the discussions of Section 3.3, we know that the solution to 
problem (2) is sparse in the Legendre PC basis. These are the two key factors 
affecting the stability and accuracy of our sparse approximation. 

Following [31, 13] , we next state a sufficient condition on the sparsity of u(x, y) 
(or, equivalently, the mutual coherence of \I/) such that the problem (Pi t s) leads 
to a stable and accurate sparse solution. By stability, we simply mean that 
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the PC coefficients c 1,<5 recovered from the ^-minimization problem (Pi,<j) do 
not blow up in the presence of the truncation error S. We first assume that x 
is a fixed point in space and subsequently extend the analysis over the entire 
spatial domain V. 

Lemma 3.6 (A condition on sparsity for stability of (Pi : s)) Letu° p (y) : = 
SaeAp d Ca^Pa-iu) with c° = for a. A Pejl/e be the sparse Legendre PC ap- 
proximation of u(x, y) at a spatial point x where the sparse index set A. Pejl/e 
is defined in (33). Assume that the vector of PC coefficients c° satisfies the 
sparsity condition 



\c% 



A„„J<(l + l/ M (*))/4. 



(49) 



Let Up' 5 (y) := J2aeA pd ^a^aiy) be the approximation ofu(y) with coefficients 
c 1 ' 5 computed from the t\-minimization problem [P^^.Then, with probability 
at least 1 ~ ex P(~ 8P ^ p d ) ( on the N samples {u{yj)}f =1 ) and for all 5 > 0, the 



solution ul' s must obey 



E 



u 



1.6 



< 



4 (<5+||*c°-u|| 2 ) 
3~7Vl-/i(*)(4|| C °|| - 



(50) 



Proof. Using Theorem 3.1 of [31], we obtain that if c° satisfies the sparsity 
condition ||c°|| < (1 + l//i(*))/4, then 



£ (ch 5 - c° 



ol\\2 



< 



(S 



U 2 



M*) (4||cP| 



where HV^I^ is the ^-norm of the column of \I/ corresponding to index ex.. The 
presence of 1 1 Vce 1 1 2 ^ s ^ ue ^° ^ ne ^ ac ^ ^ na ^ ^ ne columns of \I/ are not normalized. 
Next, using McDiarmid's inequality and the independence of the entries j] 
for distinct is, we obtain that 



Prob 



\\* 



OL ||2 



< 



3N 



£ (<*• - 



ol || 2 



< exp 



N 



8||^c 114 



j ol\\l<*>(y) 

(51) 

We conclude using Lemma 3.5 and the fact that due to the orthonormality of 



{ip a {y)} we have E 



c° - C 1 ' 5 



□ 



Remark: The error bound in (50) is not tight; in fact, the actual error is 
significantly smaller than the upper bound given in (54). More importantly, 
according to [31], the sparsity condition (49) is unnecessarily too restrictive. 
In practice, both far milder sparsity conditions are needed and much better 
actual errors are achieved. 



1(3 



Remark: We will later use the sparsity condition (49) to derive the suffi- 
cient condition (22) (together with (24)) on the number N of random samples 
needed for a successful recovery. As the condition (49) is too restrictive, the 
theoretical lower bound on N given in (22) and (24) is too pessimistic. 

Remark: According to Lemma 3.6, we do not need to know a priori the sparse 
index set A Pe) „ e ; only the sparsity condition (49) is required. 

Remark: We stated Lemma 3.6 for the case where the sparsity of the PC 
expansion is due to the fact that the effective dimensionality v e is potentially 
smaller than d. However, as far as the stability condition (49) is satisfied, sim- 
ilar stability results are valid for situations where dominant basis are defined 
over all the dimensions. 

Remark: With slight modifications, a similar argument as in Lemma 3.6 may 
be asserted for the solution of £ - mm imization problem {Po,s)- Specifically, in 
that case, we only require a sparsity limit ||c || = |A Pei j, e | < (1 + l//i(^ r ))/2 



to achieve the error estimate E 



«5 - <> 5 



< 



37V l- / u(*)(2||c°|| -l) ' 



Notice that the normalized truncation error 

4, := tf^* (52) 

is the sample average estimate of the mean-squares sparse approximation error 
E (u — Up) 2 at the point x and is a function of samples {y}^Li in addition 
to the order p e and the dimensionality v e of the sparse PC expansion. For a 
given set of N random independent samples {y}f =x and {u(y)}f =x , we may 
bound e 2 N in probability using McDiarmid's inequality, i.e., 



Prob 



i z N > E 



\u W p || Loo( - r ) 



Although our sparse approximations are point-wise in space, we are ultimately 
interested in deriving suitable global stability and error estimates for our 
sparse reconstructions. Such extensions are readily available from Lemma 3.6 
and are stated in the following corollary. 

Corollary 3.7 Let u° p (x,y) := E a e\ p . M c °c l ( x )i'M with c° a (x) = for a <£ 
Ap 6 ,v 6 be the sparse Legendre PC approximation of u(x, y) where the sparse 
index set A Pei „ E is defined in (33). Assume that the vector of PC coefficients 
c°(x) satisfies the sparsity condition ||c°||o = |A PejI/ J < (1 + l//i(^))/4. Let 
Up' s (x,y) := Y,a£K pd cl a( x )' l Pa{y) be the approximation ofu(x,y) with coeffi- 
cients c l,5 (x) computed from the l\-minimization problem (Pi t s) (it any point 
x E V .Then, with probability at least 1 — exp(— &p Z p d ) (on the N samples 
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{u{yi)}f =l ) and for all 5 > 0, the solution u p ,s must obey 



< 



4 (5 + ||*c° - u\\ L 2 



l\v,l\v)) ~ sn 1 - (4||c°|| - I) 



(54) 



Furthermore, the normalized error e 2 N 
above in probability through 

Prob e" A , > u — u 



*c°— u\ 



l2(p,^(k^)) ^ bounded from 



L 2 (X>,L 2 (r)) 



+ r 



< exp -2N- 



\u — u 



0114 

plli 2 (D,L°°(r)) 



(55) 



We have now all the necessary tools to proceed with the proof of our main 
result stated in Theorem 3.3 which is primarily a direct consequence of Lemma 
3.2, Theorem 3.4, and Corollary 3.7. 

Proof of Theorem 3.3. We first note that, given the conditions of Lemma 3.2, 
the solution to problem (2) admits a sparse Legendre PC expansion u p with 

sparsity S = \A PetUe \ < e _1 / p when e > exp (— (j^j ) for some constants 
Ci and (arbitrary) p > 0. Notice that the sparse approximation u p has an 
accuracy better than e in the Hq(T>, L°°(T)) sense. Based on Theorem 3.4, it 
is sufficient to have random solution samples of size N > 64P 4c p- d (lnP)S, to 
meet the sparsity requirement S = |A Pe;i/ J < (1 + l/yu(^))/4, in Corollary 
3.7, with probability at least 1 - AP 2 ~ 2Sm ^ where S max := mp ic^ d (ln p) ■ On 

the other hand, given N random samples of solution, we require e > -577— to 

'-'max 

satisfy the sparsity condition. Given Corollary 3.7 and using the triangular 
and Poincare inequalities, with probability at least 1 — p~ 8Smax ; we have 



u — u 



L 2 {V,L 2 (V)) 



u 



< c v e + 



Hl(V,L 2 {T)) 
5+ ||*C° 



+ 



u 



1,5 



L*<p,L\T)) 



U\ 



L 2 (V,h(^ N )) 



'3N 



for all 5 > 0. Moreover, by choosing r 



u — u 



L 2 (X>,L°°(r)) 



in (55), 



u\ 



-N 



L 2 (D 



N 



< 



u — u 



2 1 

L 2 (D,L 2 (r))~^4 



u — u 



2 5 

< -Cr>e 

L 2 (r>,L°°(r)) — 4 D 



.2 ji 



with probability at least 1 — P 

V5 



_oc p 4c p,d 



Finally, by taking 



c 4 := c v 1 + 



and 



we arrive at the statement of the Theorem 3.3. □ 



c 5 := 



V3Jl - M*) (AS - I) 
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3. 6 Choosing the truncation error tolerance 5 



An important component of the sparse approximation using (Pij) and (-Po,<5) 
is the selection of the truncation error tolerance 5. Although the stability 
bounds given in Lemma 3.6 and Corollary 3.7 are valid for any 5 > 0, the 
actual error and the sparsity level of the solution to (Pi,s) and (Po,s) depend 
on the choice of 5. Ideally, we desire to choose 5 ~ \\*&c — u\\2] while larger 
values of 5 deteriorate the accuracy of the approximation, as in Lemma 3.6, 
smaller choices of 5 may result in over-fitting the solution samples and, thus, 
less sparse solutions. In practice, as the exact values of the PC coefficients 
c° are not known, the exact values of the truncation error \\*S?c° — u\\ and, 
consequently, 5 are not known a priori. Therefore, 5 has to be estimated, for 
instance, using statistical techniques such as the cross-validation [11, 59]. 

In this work, we propose a heuristic cross-validation algorithm to estimate 5. 
We first divide the N available solution samples to N r reconstruction and N v 
validation samples such that N = N r + N v . The idea is to repeat the solution 
of (Pi,s) (or (Pq,s)) on the reconstruction samples and with multiple values of 
truncation error tolerance 5 r . We then set 5 = ^J~-j^-8 r in which 5 r is such that 
the corresponding truncation error on the N v validation samples is minimum. 
This is simply motivated by the fact that the truncation error on the vali- 
dation samples is large for values of 5 r considerably larger and smaller than 
\\*f?c° — u|| 2 evaluated using the reconstruction samples. While the former is 
expected from the upper bound on the approximation error in Lemma 3.6, 
the latter is due to the over-fitting the reconstruction samples. The following 
exhibit outlines the estimation of 5 using the above cross-validation approach: 



Algorithm for cross-validation estimation of 8: 

• Divide the N solution samples to N r reconstruction and N v validation 
samples. 

• Choose multiple values for d~ r such that the exact truncation error 
H^c — u\\2 of the reconstruction samples is within the range of 5 r 
values. 

• For each value of 5 r , solve (Pi,s) (or (Po,s)) on the N r reconstruction 
samples. 

• For each value of 5 r , compute the truncation error 5 V := ||\E , c 1,<5r — u\\2 
(or S v := ||\E , c 0, ' 5r — tx|| 2 ) of the N v validation samples. 

• Find the minimum value of S v and its corresponding 5 r := 5 r . 



Set 6 
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In the numerical experiments of Section 4, we repeat the above cross-validation 
algorithm for multiple replications of the reconstruction and validation sam- 
ples. The estimate of 5 = is then based on the value of S r for which 
the average of the corresponding truncation errors 8 V , over all replications of 
the validation samples, is minimum. This resulted in more accurate solutions 
in our numerical experiments. 



3. 7 Algorithms 



There are several numerical algorithms for solving problems (Pq s s) and (Pi t s) 
each with different optimization kernel, computational complexity, and degree 
of accuracy. An in-depth discussion on the performance of these algorithms is 
outside the scope of the present work; however, below we name some of the 
available options for each problem and briefly describe the algorithms that 
have been utilized in our numerical experiments. For comprehensive discus- 
sions on this subject, the interested reader is referred to [13, 57, 35, 4, 63, 56]. 

Problem {Po,s) : A brute force search through all possible support sets in order 
to identify the correct sparsity for the solution c 0,<5 of (Pq,s) is NP-hard and 
not practical. Greedy pursuit algorithms form a major class of schemes to 
tackle the solution of (Po,s) with a tractable computational cost. Instead of 
performing an exhaustive search for the support of the sparse solution, these 
solvers successively find one or more components of the solution that result in 
the largest improvement in the approximation. Some of the standard greedy 
pursuit algorithms are Orthogonal Marching Pursuit (OMP) [51, 26], Regular- 
ized OMP (ROMP) [45], Stagewise OMP (StOMP) [30], Compressive Sampling 
MP (CoSaMP) [44], Subspace Pursuit [23], and Iterative Hard Thresholding 
(IHT) [25]. Under well-defined conditions, all of the above schemes provide 
stable and accurate solutions to (Po t s) in a reasonable time. 

In the present study, we employ the OMP algorithm to approximate the so- 
lution of (Pq.s)- Starting from c°' S '^ = and an empty active column set 
of at any iteration k, OMP identifies only one column to be added to the 
active column set. The column is chosen such that the £ 2 -norm of the residual, 
||\j> c oA(fc) _ u\\ 2 , is maximally reduced. Having specified the active column set, 
a least-squares problem is solved to compute the solution c°' S '( k \ The iterations 
are continued until the error truncation tolerance 8 is achieved. In general, the 
complexity of the OMP algorithm is 0(S ■ N ■ P) where S := ||c 0,<5 ||o is num- 
ber of non-zero (dominant) entries of c°' s . The following exhibit depicts an 
step-by-step implementation of the OMP algorithm. 
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Orthogonal Matching Pursuit ( OMP) Algorithm: 

• Set k = 0. 

• Set the initial solution c ' 5 '^ = and residual r (0 ) = u - tyc ' 6 ^ = u. 

■ Set the solution support index set 1^ = 0. 

• While || u — ^fc°' s '^ W2 > 5 perform: 

• For all j 1^ evaluate e(j) = \\if>jOij — with ctj = 
^JrW/||^||I. 

. Set k = k + 1. 

■ Update the support index set = IJ {argmin,,- e(j)}. 

■ Solve for c ' 6 '^ = argmin c o,« \\u— ^c°' s \\ 2 subject to Support{c°' s } = 

• Update the residual = u — vj/c ' 5 '^ 

• Output the solution c°' s = c°' S '( k \ 



Although we chose OMP in our analysis, we note that further studies are 
needed to identify the most appropriate greedy algorithm for the purpose of 
this study 

Problem (-Pi, 5).' The majority of available solvers for ^-minimization are based 
on alternative formulations of (Pi,s), such as the ^-norm regularized least- 
squares problem 

(QPx): min-||*c-u||2 + A|| Wc\\ u (56) 
c 2 

or the LASSO problem, [54], 

1 2 

(LS T ) : min - ||\l>c — u\\ 2 subject to ||Wc||i < r. (57) 
c 2 

It can be shown that for an appropriate choice of scalars S, A, and r, the prob- 
lems (Pi,s), (QP\), an d (LS T ) share the same solution [57, 13, 56]. Among 
others, the problem (QP\) is of particular interest as it is an unconstraint op- 
timization problem. Numerous solvers based on the active set [50, 34], interior- 
point continuation [20, 40] and projected gradient [24, 22, 37, 12, 9, 57, 3, 4] 
methods have been developed for solving the above formulations of the t\- 
minimization problem. 

In our numerical experiments, we adopt the Spectral Projected Gradient al- 
gorithm (SPGL1) proposed in [57] and implemented in the MATLAB package 
SPGL1 [5] to solve the ^-minimization problem (P\ t s) m (21). SPGL1 is based 
on exploring the so-called Pareto curve, describing the tradeoff between the 
£ 2 -norm of the truncation error ||^c — u\\ 2 and the (weighted) ^-norm of 
the solution ||Wc||i, for successive solution iterations. At each iteration, the 
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LASSO problem (57) is solved using the spectral projected gradient technique 
with a worst-case complexity of 0(PXriP) where P is the number of columns 
in Given the error tolerance 6, a scalar equation is solved to identify a value 
for r such that the (LS T ) solution of (57) is identical to that of (Pij) in (21). 
Besides being efficient for large-scale systems where may not be available 
explicitly, the SPGL1 algorithm is specifically effective for our application of 
interest as the truncation error — 2 is known only approximately. 

In the next section, we explore some aspects of the proposed scheme through 
its application to a 1-D (in space) elliptic stochastic PDE with high-dimensional 
random diffusion coefficients. 



4 Numerical examples 



We consider the solution of a one- dimensional, i.e., D — 1, version of problem 
(2), 



d ( du(x,u) 

■— \a(x,w) — 

dx \ dx 



1, xeV = (o,i), 



(56 



u(0, u) = u(l, u) = 0, 
where the stochastic diffusion coefficient a(x, u) is given by the expansion 



a(x, lo) = a + a a J2y h<t>i(x)Vi(u) ■ 



(59) 



i=i 



Here, {\i}f =1 and {4>i(x)}f =1 are, respectively, d largest eigenvalues and the 
corresponding eigenfunctions of the Gaussian covariance kernel 



C aa (xi,x 2 ) = exp 



P 



(60) 



in which l c is the correlation length of a(x, u) that prescribes the decay of 
the spectrum of C aa in (60). Random variables {yi(u)}f =1 are assumed to be 
independent and uniformly distributed on [—1,1]. The coefficient a a controls 
the variability of a(x,u). 

We verify the accuracy and efficiency of the present sparse approximation 
schemes for both moderate and high- dimensional diffusion coefficient a(x,u). 
These two cases are obtained, respectively, by assuming (l c ,d) = (1/5,14) 
and (l c ,d) = (1/14,40) in (60) and (59). We further assume that a = 0.1, 
o„ = 0.03 when d = 14, and o n = 0.021 when d = 40. These choices ensure 
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that all realizations of a(x,u) are strictly positive on T> = (0,1). Table 1 
summarizes the assumed parameters for the two test cases. 



Table 1 

Choices of parameters defining the stochastic description of diffusion coefficient 
a(x, (J) in Eq. (59). 



Case 


a 


0"a 


lc 


d 


I 


0.1 


0.030 


1/5 


14 


11 


0.1 


0.021 


1/14 


40 



For both cases, the spatial discretization is done by the Finite Element Method 
using quadratic elements. A mesh convergence analysis is performed to ensure 
that spatial discretization errors are inconsequential. 

The solution statistics are computed using the conventional Monte Carlo sim- 
ulation, the isotropic sparse grid stochastic collocation with the Clenshaw- 
Curtis abscissas [60, 1], and the proposed sparse approximation techniques. We 
use the Basis Pursuit Denoising (BPDN) solver implemented in SPGL1 [5, 57] 
to solve the ^-minimization problem (P\ t s) and the Orthogonal Matching 
Pursuit (OMP) solver in SparseLab [28] to approximate the £ - m inimization 
problem (Po,<s)- We compare the errors in the mean, standard deviation, and 
root mean-squares of the solution error at x = 0.5 using the above methods. 
The details of the analysis are reported below. 



4.1 Case I: d= 14 



We consider an increasing number N = {29,120,200,280,360,421,600} of 
random solution samples to evaluate the solution u at x = 0.5 and, con- 
sequently, to compute the PC coefficients of the solution using l\- and Iq- 
minimization. These samples are nested in the sense that we recycle the pre- 
vious samples when we perform calculations with larger sample sizes. The 
nested sampling property of our scheme is of paramount importance in large 
scale calculations where the computational cost of each solution evaluation 
is enormous. We note that sample sizes N = 29 and N = 421, respectively, 
correspond to the number of nested abscissas in the level I = 1 and level / = 2 
of the isotropic stochastic collocation with the Clenshaw-Curtis rule. 

As elucidated in Section 3.5, the accuracy of our sparse reconstruction depends 
on the mutual coherence /i(^), the sample size N, and the truncation error 
|| — u || 2 (hence 5). In order to reduce the approximation error, we need 
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to reduce \\&c — 2, which may be done by increasing p and, therefore, P. 
However, with a fixed number N of samples, an increase in P may result in 
a larger mutual coherence and, thus, the degradation of the reconstruction 
accuracy. Therefore, in practice, we start by approximating the lower order 
PC expansions when N is small and increase p when larger number of samples 
become available. Notice that such an adaptivity with respect to the order 
p is a natural way of refining the accuracy of PC expansions, for instance, 
when the intrusive stochastic Galerkin scheme is adopted [36]. In particular, 
in this example, for sample sizes N = {29, 120}, we attempt to estimate the 
coefficients of the 3rd-order Legendre PC expansion, i.e. p = 3 and P = 680. 
For larger sample sizes N, we also include the first 320 basis function from the 
4th-order chaos, thus resulting in P = 1000. Since all of the 4th-order basis 
functions are not employed, we need to describe the ordering of our basis 
construction. We sort the elements of {ijj a (y)} such that, for any given order 
p, the random variables yi with smaller indices % contribute first in the basis. 

For each analysis, we estimate the truncation error tolerance 5 based on 
the cross-validation algorithm described in Section 3.6. For each N, we use 
N r « 3N/4 of the samples (reconstruction set) to compute the PC coeffi- 
cients c 1,5r and the rest of the samples (validation set) are used to evaluate 
the truncation error 8 V . The cross-validation is performed for four replications 
of reconstruction and validation sample sets. We then find the value 8 r that 
minimizes the average of 5 V over the four replications of the cross-validation 
samples. Given an estimate of the truncation error tolerance 5 ~ JA/35 r , we 
then use all N samples to compute the coefficients c 1,<5 . 

Figure 2 compares the 'exact' PC coefficients with those obtained using BPDN 
and OMP solvers. We only demonstrate the results corresponding to sample 
sizes N = {120,600}. An 'exact' solution is computed using the level I = 8 
stochastic collocation for which the approximation errors are negligible in 
our comparisons. We observe that BPDN tends to give less sparse solutions 
compared to OMP. This is due to the facts that i) the solution is not exactly 
sparse, i.e., there are many non-zero (although negligible) coefficients c a , ii) 
the l\ cost function does not impose a sufficiently large penalty on the small 
coefficients as does the £0 cost function, and Hi) the truncation error tolerance 
5 may be under-estimated. To reduce this issue, a number of modifications, 
including the reweighted ^-minimization [18, 62, 43, 39], have been introduced 
in the literature that are the subjects of our future work. In contrary, OMP 
results in more sparse solutions as it adds basis function one-at-a-time until 
the residual falls below the truncation error. However, as is seen in Figs. 2 
(b) and (d), a number of small coefficients are still over-estimated. This is 
primarily due to under-estimation of the truncation error tolerance 5 in the 
cross-validation algorithm. 
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Fig. 2. Approximation of polynomial chaos (PC) coefficients c of u(0.5, y) using 
BPDN and OMP for d = 14. (a) BPDN with N = 120 samples, (b) OMP with 
N = 120 samples, (c) BPDN with N = 600 samples, and (d) OMP with N = 600 
samples . 'Exact' coefficients computed from level 8 stochastic collocation with the 
Clenshaw-Curtis abscissas (n); BPDN and OMP (•). 
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Fig. 3. Comparison of relative error in solution statistics at x = 0.5 for the Monte 
Carlo simulation, isotropic sparse grid stochastic collocation with the Clenshaw-Cur- 
tis abscissas, and the proposed sparse approximations (BPDN and OMP) for d = 14. 
Two sets of independent random samples of u(0.5, y) are generated first and are used 
for the Monte Carlo simulation, BPDN, and OMP. The solid and dashed lines corre- 
spond to the first and second sets of samples, respectively, (a) Relative error in the 
mean; (b) Relative error in the standard deviation; (c) Relative root-mean-squares 
(rms) error; and (d) Estimation of 5 using cross-validation: 5 is computed from 5 r 
for which 5 V is minimum. (Monte Carlo simulation (□); stochastic collocation (o); 
BPDN (o); OMP (V)). 

The convergence of the mean, standard deviation, and root mean-squares of 
the approximation error for u(0.5,y) is illustrated in Figs. 3 (a), (b), and 
(c), respectively. For the case of stochastic collocation, we apply sparse grid 
quadrature (cubature) integration rule to directly compute the mean and the 
standard deviation. The root mean-squares error of the Monte Carlo and the 
stochastic collocation solution are evaluated by estimating the corresponding 
PC coefficients using sampling and sparse grid quadrature integration, respec- 
tively, and then comparing them with the exact coefficients. 
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To make a meaningful comparison, for each N, the samples used to com- 
pute the solution statistics by the conventional Monte Carlo, BPDN, and 
OMP are identical. In this sense, the sparse approximation using l\- and £q- 
minimizations may be viewed as only post-processing steps in the Monte Carlo 
simulation. As the sample sizes are finite, the estimates of the PC coefficients 
c are sample dependent and are in fact random. To demonstrate the conver- 
gence of the algorithm with respect to different sets of samples, we repeat the 
analysis for two independent sets of N samples and report the corresponding 
statistics errors with solid and dashed lines. Although for different solution 
samples of size N the estimates of c and the solution statistics are not identi- 
cal, the approximation converges, with large probability, for any set of samples 
with sufficiently large size N (see Theorem 3.3). 

Figure 3 (d) illustrates the statistical estimation of 5 using the cross-validation 
approach described in Section 3.6. The estimation of 6 is slightly different in 
BPDN and OPM, this is a consequence of different reconstruction accuracy 
of these techniques. Moreover, the solution of the BPDN algorithm is less 
sensitive to small perturbations in the truncation error 5 compared to that of 
the OMP algorithm. This is justified by the fact that the ^o-norm is highly 
discontinuous. 

Remark: Despite the conventional implementation of the stochastic colloca- 
tion approach where the approximation refinement requires a certain number 
of extra samples, the l\- and £o _m hiimizat ions may be implemented using ar- 
bitrary numbers of additional samples, which is an advantage, particularly, 
when only a limited number of samples is afforded. 

4.2 Case II: d=40 

The objective of this example is to highlight that a sparse reconstruction may 
lead to significant computational savings for problems with high- dimensional 
random inputs. Similar to the analysis of Case I described in Section 4.1, 
we compute the solution statistics using multiple numbers of independent 
samples. More specifically, we evaluate the solution at x — 0.5 for independent 
samples of size N = {81, 200, 400, 600, 800, 1000}. The number of grid points 
in the level I = 1 and I = 2 of the Clenshaw-Curtis rule in dimension d = 40 
is N = 81 and iV = 3281, respectively. To obtain a reference solution, the 3rd 
order PC coefficients c of the solution at x = 0.5 are computed using level 
I = 5 stochastic collocation with the Clenshaw-Curtis rule. 
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Fig. 4. Approximation of polynomial chaos (PC) coefficients c of u(0.5, y) using 
BPDN and OMP for d = 40. (a) BPDN with N = 200 samples, (b) OMP with 
N = 200 samples, (c) BPDN with N = 1000 samples, and (d) OMP with N = 1000 
samples . 'Exact' coefficients computed from level 5 stochastic collocation with the 
Clenshaw-Curtis abscissas (□); BPDN and OMP (•). 
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Fig. 5. Comparison of relative error in solution statistics at x = 0.5 for the Monte 
Carlo simulation, isotropic sparse grid stochastic collocation with the Clenshaw-Cur- 
tis abscissas, and the proposed sparse approximations (BPDN and OMP) for d = 40. 
Two sets of independent random samples of u(0.5, y) are generated first and are 
used for the Monte Carlo simulation, BPDN, and OMP. The solid and dashed lines 
correspond to the first and second sets of samples, respectively, (a) Relative error 
in mean; (b) Relative error in standard deviation; (c) Relative root-mean-squares 
(rms) error; and (d) Estimation of 5 using cross-validation: 5 is computed from 5 r 
for which 5 V is minimum. (Monte Carlo simulation (n); stochastic collocation (o); 
BPDN (o); OMP (V)). 

For N = {81,200} we only estimate the coefficients associated with the 2nd- 
order PC expansion, i.e. p = 2 and P = 861. For larger sample sizes, we also 
include the first 639 basis functions from the 3th-order chaos, thus leading 
to P = 1500. For each combination of N and p, we estimate the truncation 
error 5 using an identical cross-validation procedure described in Section 4.2. 
Figure 4 illustrates the estimation of PC coefficients of u(0.5, y) with BPDN 
and OMP algorithms with N = 200 and iV = 1000. We again note that the 
recovered solution from the BPDN algorithm is less sparse as compared to that 
of the OMP approach, although the over-estimated coefficients (mostly from 
the second order term) are indeed small. As the samples size is increased, 
we are naturally able to recover more dominant coefficients on the expansion. 
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Figure 5 depicts the convergence of the statistics of the solution as functions 
of the sample size N as well as one instance of the estimation of the truncation 
error tolerance 5. The implementation details are similar to those described 
in Section 4.1 for the case of d = 14. 

It is worth highlighting that the computational saving of the present sparse 
approximations (in terms of the number of samples needed to achieve a certain 
accuracy) compared to the isotropic sparse grid collocation is even larger for 
the higher- dimensional (d = 40) problem. This is due to the fact that the num- 
ber of samples needed to recover the solution using l\- and £ - mmrm i za tion 
is dictated more by the number of dominant terms in the PC expansion com- 
pared to the total number of terms P, as in Theorem 3.1. 



5 Conclusion 

The present study proposes a non-intrusive and non-adapted approach based 
on the compressive sampling formalism for the approximation of sparse so- 
lution of stochastic PDEs. When sufficiently sparse in the polynomial chaos 
(PC) basis, the compressive sampling enables an accurate recovery of the so- 
lution using a number of random solution samples that is significantly smaller 
than the cardinality of the PC basis. Sparse PC approximations based on £ - 
and ^-minimization approaches have been introduced and implemented using 
the Basis Pursuit Denoising (BPDN) and the Orthogonal Matching Pursuit 
(OMP) algorithms, respectively. Probabilistic bounds based on the concentra- 
tion of measure phenomenon have been derived to verify the convergence and 
the stability of the present sparse constructions. The performance and effi- 
ciency of the proposed techniques are explored through their application to a 
linear elliptic PDE with high-dimensional random diffusion coefficients where 
the sparsity of the solution with respect to the PC basis is guaranteed. The 
proposed formalism to recover the sparse PC expansion of stochastic functions 
is not restricted to the case of the elliptic PDEs, as its underlying applicability 
assumptions are universal. Although the discussions of this work have been 
focused on the particular case of the Legendre PC expansions, the proposed 
framework can be readily extended to other bases such as the Hermite PC 
(when the random variables y are standard Gaussian). 
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